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Abstract 

POCP is a new Matlab package running jointly with GloptiPoly 3 and, option- 
ally, YALMIP. It is aimed at nonlinear optimal control problems for which all the 
problem data are polynomial, and provides an approximation of the optimal value 
as well as some control policy. Thanks to a user-friendly interface, POCP reformu- 
lates such control problems as generalized problems of moments, in turn converted 
by GloptiPoly 3 into a hierarchy of semidefinite programming problems whose asso- 
ciated sequence of optimal values converges to the optimal value of the polynomial 
optimal control problem. In this paper we describe the basic features of POCP and 
illustrate them with some numerical examples. 



1 What is POCP? 



POCP is a Matlab package aimed at solving approximately nonlinear optimal control 
problems (OCPs) for which all the problem data are polynomial. Consider a continuous- 
time system described by the differential equation 

x{t) = f{t,x{t),u{t)), (1) 
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where x G M" and u G M"* are the state and input vectors, respectively. Given the running 
cost /i : R X M" X — > M and the final cost if : ^ M, the total cost to be minimized 
in the OCP is defined as 

J(0,T,x(0),m(-)) = / h{t,x{t),u{t))dt + H{x{T)), (2) 
Jo 

where T is the horizon length. Different OCPs can be formulated depending if the initial 
condition a;(0) has been assigned or not. 

1.1 Optimal control problem with free initial condition 

Consider the following constraints: 

• x{0) eCj = {x : gi^{x) < 0, j = 1, . . .,ni}; 

• x(T) eCp = {x : gF,{x) < 0, j = 1, . . . ,nir}; 

• it^x{t),u{t)) eCT = {it,x,u) : gT^{t,x,u) < 0, j = I, . . . 



An OCP with free initial condition is of the form 

min J(0,T,x(0),u(t)) 

x{0),u{t) 

x(0) G Cj (3) 
{t,x{t),u{t)) G Ct 
x{T) G Cf 

To solve this problem using POCP all the data must be polynomial. More specifically, /, 
h, H, gi , gT., and gp. must be polynomial functions. 



1.2 Optimal control problem with fixed initial condition 

When set Cj contains only one point, the initial condition is fixed. From this viewpoint, the 
minimization problem ([3]) can be used also to formulate OCPs where the initial condition 
is fixed. 

However, it can be interesting to consider a more general framework where the initial 
condition is not known exactly, but stochastically. In this case, the probability measure 
of the initial condition fij is given, and we consider the following OCP 

min f J{0,T, XQ,u{xQ,t))diii{xo) 
"(■) 

{t,x{t),uit)) eCt ' (4) 

x{T) G Cp. 
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To better understand how to use this formulation, see [T]. 

POCP can also deal with problems where some state variables are fixed at time t = and 
some of them can be chosen inside a set Cj. 

POCP can be used for problems where the horizon T is fixed or not. 



2 How does POCP work and what does it calculate? 

A full explanation of the theory on which POCP is based is out of scope for this user's 
guide. The interested reader is referred to [1] and |2]. 

The techniques implemented in POCP are based on a modeling via occupation measures 
associated with semidefinite programming (SDP) relaxations. POCP formulates a hier- 
archy of relaxations giving a sequence of lower bounds on the optimal value of the OCPs 
([3]) and (jl]). Furthermore, POCP returns the moment matrices of the occupation mea- 
sures used and, as a byproduct, a polynomial subsolution of the Hamilton- Jacobi-Bellman 
(HJB) equation. 



3 Installation 

POCP is a free Matlab package consisting of an archive file downloadable from 
www . laas . f r/~henrion/ sof tware/pocp 
The installation consists of two steps: 

• extract the directory @pocp from the archive file, 

• copy Opocp on your working directory or, using the command addpath, add the 
parent directory of Opocp to the Matlab path. 

POCP is based on GloptiPoly version 3.3 or higher |3]. Therefore it is assumed that this 
package is properly installed. To compute polynomial subsolutions of the HJB equation, 
the optimization modelling toolbox YALMIP |3] must be installed as well. 
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4 An introductory example 



Consider the double integrator 

'xiit) 

with state constraint X2{t) > — 1 and input constraint —l<u{t)<l. When we consider 
the problem of steering in minimum time the state to the origin, an analytic solution to 
this problem is available [2]. When x(0) = [1 1]-^, the value of the minimum time to reach 
the origin is T = 3.5. In the following, we show a Matlab POCP script which defines the 
problem and calculates a lower bound on the optimal value. 

% system variables 

mpol X 2; % state vector of size 2 

mpol u; % scalar input 

% state and input matrices 
A = [0 1; 0]; 
B = [0; 1] ; 

% define optimal control problem 
prob = pocp( . . . 

'state' , X, ... 

' input ' , u , ... 

' dynamics ' , A*x+B*u) ; 

% set further properties 

prob = set (prob, 'idirac', x, [1; 1], ... % initial condition 
'fdirac', x, [0; 0], ... % final condition 
'tconstraint ' , [x(2)>=-l; u>=-l; u<=l] , ... /(, constraints 
'scost', 1); °/o setting integral cost h to 1 

% call to the solver 

[status, cost] = solvepocpCprob, 14); % 14 = degree of moment matrix 

dispC'The lower bound on the minimum time is'); 
disp(cost) ; 

We can notice that the user is required to know only four commands: mpol (a GloptiPoly 
command to define variables), pocp, set and solvepocp. When the script file is executed, 
we obtain the following output 





1 




Xl{t) 


+ 












X2{t) 


1 
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The lower bound on the minimum time is 
3.4988 

In the following section we illustrate all POCP commands together with examples of their 
usage. 



5 Command guide 

POCP consists of three commands: pocp, set, and solvepocp. 



5.1 The pocp command 

Short description: Creates a new POCP object. 
SyntELX and usage: When used in the form 

» prob = pocp; 

the command pocp creates an empty POCP object prob of class pocp. The problem data 
can be then specified using the command set. 

The pocp command can also be used to create a POCP object and, at the same time, set 
the problem data. In this case, the syntax is 

» prob = pocp( 'Propertyl ' , Argumentl, Argument2, . . . , 'Property2\ ...) 

The properties and the arguments should be specified accordingly to the syntax of the 
set command, see below. 



5.2 The set command 

Short description: Sets the data of a POCP. 

SynteLX and usage: Specifies one or more properties of the POCP object. For example, 
to set a property which requires two arguments, use 

» prob = set (prob, 'Propertyl', Argumentl, Argument2) ; 



To set more properties at once, 



» prob = setCprob, 'Propertyl', Argument 1, 'Property2' , ...); 

To help the user remember the name of the properties the following convention has been 
used: "i" refers to the initial condition, "f " to the final condition, "t" to the trajectory, 
and "s" (for sum) to an integral. 

In the sequel we list and describe all the properties that can be specified and their argu- 
ments. 



5.2.1 Setting the state variables 

Property name: state. 
Number of arguments: 1. 
Default value: empty. 

Description: Specifies state variables, a vector of GloptiPoly class mpol. 
Example 

In this example we declare a variable x of dimension 2 and, after creating a POCP object 
prob, we set x as the state variable. 

» mpol X 2; 
» prob = pocp; 

» prob = setCprob, 'state', x) ; 

State variables can also have different names as in the following example 

» mpol xl x2; 
» prob = pocp; 

» prob = setCprob, 'state', [xl; x2] ) ; 

For more information on declaring GloptiPoly variables type 

» help mpol 

or check [5]. 



5.2.2 Setting the input variables 



Property name: input. 
Number of arguments: 1. 
Default: empty. 

Description: Specifies input variables, a vector of GloptiPoly class mpol. 
Example 

Suppose a POCP object prob has been already defined. With the following commands a 
scalar mpol variable u is defined and then set as the input variable. 

» mpol u; 

» prob = set (prob , ' input ' , u) ; 



5.2.3 Setting the time variable 

Property name: time. 
Number of arguments: 1. 
Default: empty. 

Description: Specifies the time variable, a scalar of GloptiPoly class mpol. When the 
time variable is not specified but is needed internally by POCP, then it is defined auto- 
matically. 
Example 

Suppose a POCP object prob has already been defined. With the following commands a 
scalar mpol variable t is defined and then set as the time variable. 

» mpol t; 

» prob = setCprob, 'time', t) ; 



5.2.4 Setting the dynamics 

Property name: dynamics. 
Number of arguments: 1. 
Default: empty. 

Description: Specifies the system dynamics, a vector of GloptiPoly class mpol of the 

same size as the state vector. 

Example 

Assume a POCP object prob has been created and the state vector x and input variable 
u have been set. In this example we set the linear dynamics specified by the state matrix 
A and input matrix B. 

» A = [0 1; 1 1] ; 
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» B = [0; 1] ; 

» prob = setCprob, 'dynamics', A*x+B*u) ; 



5.2.5 Setting the horizon length 

Property name: horizon. 
Number of arguments: 1. 
Default: 0. 

Description: Specifies the horizon length, a non-negative number. If set to 0, the horizon 

is free. 

Example 

Assume the POCP object prob has already been created. With the following command 
the horizon length is set to 1. 

» prob = setCprob, 'horizon', 1); 



5.2.6 Setting the final cost 

Property name: f cost. 
Number of arguments: 1. 
Default: 0. 

Description: Specifies the final cost, a polynomial of GloptiPoly class mpol whose vari- 
ables are elements of the state vector. If both the final cost and the integral cost are 0, 
then POCP minimizes the trace of the moment matrix of the occupation measure of the 
trajectory. 
Example 

Assume a POCP object prob has already been created and x(l) is one of the state 
variables. The following command sets the final cost to x(l)''2. 

» prob = setCprob, 'fcost', x(l)~2); 



5.2.7 Setting the integral cost 

Property name: scost. 
Number of arguments: 1. 
Default: 0. 

Description: Specifies the integral cost, a polynomial of GloptiPoly class mpol whose 
variables are the time variable and elements of the state and input vectors. For minimum 
time optimal control problems, set scost to 1 and set horizon to 0. If both the final cost 
and the integral cost are 0, then POCP minimizes the trace of the moment matrix of the 
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occupation measure of the trajectory. 
Example 

Assume a POCP object prob has aheady been created and the state and input vectors are 
X and u, respectively. With the following command the integral cost is set to x ' *x+u ' *u. 

» prob = set (prob, 'scost', x'*x+u'*u); 

5.2.8 Setting the dependence on time of the test functions 

Property name: testtime. 
Number of arguments: 1. 
Default: empty. 

Description: When the system dynamics does not depend on time and the horizon is 
free the test function does not depend on time. In the other cases the test function de- 
pends on time. However, this default behavior can be modified by setting the property 
testtime. When the argument is true (false) the test functions will (not) depend on 
time. To restore the default behavior, specify an empty ( [] ) argument. 
Example 

Assume a POCP object prob has already been defined. To make the test function depend 
also on time, use the following command 



» prob = setCprob, 'testtime', true); 

5.2.9 Setting equality and inequality constraints on the initial condition 

Property name: iconstraint. 
Number of arguments: 1. 
Default: empty. 

Description: Specifices equality and inequality constraints on the initial condition, a 

column vector of GloptiPoly class supcon. 

Example 

Assume the POCP object prob has already been created and x(l) and x(2) are the state 
variables. With the following command, both variables are constrained to be less than or 
equal to 1. 

» prob = setCprob, iconstraint, [x(l)<=l; x(2)<=l]); 

If the sum of x(l) and x(2) must be equal to 1, the following command should be used 
» prob = setCprob, iconstraint, xCl)+xC2)==l) ; 



5.2.10 Setting the initial condition to a Dirac delta 



Property name: idirac. 
Number of arguments: 2 or 3. 

Default: unspecified, considered as a problem unknown. 

Description: This property should be set when one or more state variables take their 
value at time accordingly to a distribution given by the sum of Dirac deltas. Two 
or three arguments can be specified. The first argument is a column vector containing 
the state variables involved. The second is a matrix whose columns contain the value of 
the variables (one column for each Dirac delta) . The third argument (optional when the 
second argument has only one column) is a row vector specifying the probability of each 
Dirac delta. 
Example 

Assume a POCP object prob has already been created and x(l) and x(2) are the state 
variables. With the following command we set the initial condition to a;(0) = (1, 1) 

» prob = setCprob, 'idirac', [x(l) ; x(2)] , [1; 1]); 

If the value at time is x{0) — (0, 1) with probability 0.8 and a:(0) = (1,1) with probability 
0.2, we should use 

» prob = setCprob, 'idirac', x, [0 1; 11]; [0.8 0.2]); 



5.2.11 Setting the initial condition to a uniform distribution on a box 

Property name: iunif orm. 
Number of arguments: 2. 

Default: unspecified, considered as a problem unknown. 

Description: This property should be set when one or more state varibles take their 
value at time accordingly to a uniform distribution on a box. Two arguments must be 
specified. The first argument is a column vector containing the state variables involved. 
The second is a matrix with two columns and the number of rows equal to the length of 
the first argument, specifying the intervals. 
Example 

Assume a POCP object prob has already been created and x(l) and x(2) are the state 
variables. With the following command we set the value of x(l) to be uniformly dis- 
tributed on the interval [—1, 1] and the value of x(2) on the interval [0, 2] 

» prob = setCprob, 'iunif orm', [xCD ; xC2)] , [-1 1; 2]); 

The properties idirac and iunif orm can also be used in the same problem, like in the 
following 



in 



» prob = setCprob, 'iuniform', x(l) , [-1 1] ) ; 
» prob = setCprob, 'idirac', x(2), -1); 



5.2.12 Setting equality and inequality constraints on the final condition 

Property name: f constraint. 
Number of arguments: 1. 
Default: empty. 

Description: Specifies equality and inequality constraints on the state variables at the 
end of the horizon. The usage is the same as for the property iconstraint. 

5.2.13 Setting the final condition to a Dirac delta 

Property name: f dirac. 
Number of arguments: 2 or 3. 

Default: unspecified, considered as a problem unknown. 

Description: This property should be set when one or more state variables take their 
value at the end of the horizon accordingly to a distribution given by the sum of Dirac 
deltas. The usage is the same as for the property idirac. 



5.2.14 Setting equality and inequality constraints on the variables along the 
trajectory 

Property name: tconstraint 
Number of arguments: 1. 
Default: empty. 

Description: Specifies equality and inequality constraints on the system variables along 

the trajectory, a column vector of GloptiPoly class supcon. 

Example 

Assume a POCP object prob has already been created and x is the scalar state, u is the 
input, and t is the time variable. With the following command we set the constraints 
x>=0 and x+u+t<=l 

» prob = setCprob, 'tconstraint', [x>=0; x+u+t<=l] ) ; 



5.2.15 Setting integral constraints 

Property name: sconstraint. 
Number of arguments: 1. 

1 1 



Default: empty. 

Description: Specifices constraints on the integral of some variables, a column vector of 

GloptiPoly class momcon. 

Example 

Assume a POCP object prob has already been created and u is the input variable. To 
enforce the integral constraint u{tYdt < 1, use the following command 

» prob = setCprob, ' sconstraint ' , mom(u"2)<=l) ; 

5.2.16 Some remcirks on the usage of the set command 

• The system variables (state vector, input vector, and time) must be set before they 
are used. Constraints on the initial state cannot be entered if the state vector has 
not been set. When more properties are specified with the same set command, the 
system variables should be specified first. 

• Once the system variables (state vector, input vector, and time) have been specified, 
they cannot be modified any further. 

• The current version of the software does not check for conflicts between different 
kinds of constraints. E.g., if the value at time for a state variable is assigned by 
setting the property idirac and then the same variable appears in the constraints 
set by iconstraint, the behavior of POCP is unpredictable. 



5.3 The solvepocp command 

Short description: Builds and solves an SDP relaxation of a POCP. 
SyntELX and usage: When used in the form 

» [status, cost, measures] = solvepocp (prob, degree); 

or, equivalently, 

» [status, cost, measures] = solvepocp (prob, 'mom', degree); 

the command solvepocp builds and solves an SDP relaxation where moments of degree 
up to degree are used. The input argument 'mom' stands for moments. The output 
parameters have the following meaning: 

• When status is strictly negative, the SDP problem is infeasible or could not be 
solved. Otherwise, the SDP problem could be solved; 

-\9. 



• cost is a lower bound on the objective function of the POCP if status is non- 
negative; 

• measures is a structure containing the measures used by GloptiPoly to solve the 
relaxation of the POCP. Once the relaxation is solved, measures contains fields 
measures . initial, measures . final, and measures .trajectory. If the initial or 
final conditions have been assigned, the corresponding measures are empty ( [] ). To 
extract the moment matrix corresponding to one of the measures, use the GloptiPoly 
function mmat in combination with double. For example, to retrieve the moment 
matrix corresponding to the trajectory occupation measure, use the following com- 
mand 

>> double (mmat (measures . traj ectory) ) ; 

It is also possible to call solvepocp by specifying the degree of the test function to be 
used when defining the moment problem: 

» [status, cost, measures] = solvepocp (prob, 'tf , degree); 

The input argument 'tf stands for test function. As a rule of thumb, degree should be 
at least twice the degree of the dynamics. Increasing the value of degree improves the 
quality of the approximation, but it also increases the size of the SDP problem solved. 

The command solvepocp can also be called by specifying four output parameters: 

» [status, cost, measures, vf] = solvepocp (prob, degree); 

Using this syntax, a subsolution of the HJB equation is calculated and stored in vf (for 
value function). This approximation is a polynomial of GloptiPoly class mpol. If the 
option 'tf ' is used, degree corresponds to the degree of the subsolution of the HJB 
equation. The subsolution vf is computed by using the YALMIP toolbox, which must be 
installed properly. 



6 First example 

In this section we show how POCP can be used to design a suboptimal controller. Consider 
as in [T] the nonlinear system 



Xi(t) 








X2{t)_ 






u{t) 



The optimal control problem consists in driving to the origin the initial states from the 
set {xi,X2) G SS = [—1,1] X [—1,1] by minimizing the cost functional h{x{t),u{t))dt 
with ^ 

h{x{t),u{t)) = Xi{tf + X2{tf + 

The terminal time T is free. 

To solve this problem and design a control law, we can define a POCP problem and use 
the command solvepocp with four output arguments. By setting the initial condition 
to be uniformly distributed on the box SS, the fourth output argument of solvepocp 
contains a polynomial subsolution of the HJB equation which approximates the value 
function along all the optimal trajectories starting from SS, see [I]. Therefore, a natural 
choice to derive a control law consists in solving the following minimization problem 

'dVf{x,t) 



mm 



+ VxVf{x)f{x, u) + h{x, u) 



(5) 



dt 

where V} represents the approximated value function. Since the dynamics is affine in u 
and hi) is quadratic in u, the global minimizer of ([5]) can be calculated by using first order 
optimality conditions: 

:= -50V^(/^(x) [0 l]^. 

In the sequel we show a Matlab POCP script which calculates a polynomial approximation 
of the value function vf and defines the control law u_x. 

% variable declaration 
mpol x 2 
mpol u 

% problem definition 
prob = pocp( . . . 

'state' , X, ... 

' input ' , u , ... 

'dynamics', [x(2)+x(l) "2-x(l) "3; u] , ... 

'horizon' , , ... 

'iuniform', x, [-1 1; -1 1], ... 

'fd' , X, [0;0] , . . . 

'scost', x'*x+u"2/100) ; 

% problem solved with test function up to degree 8 
[status, cost, measures, vf] = solvepocp (prob, 'tf, 8); 

°/„ control law definition 

u_x = -50 * diffCvf, x)*[0; 1]; 

The value of u_x can be easily evaluated by using the GloptiPoly commands assign and 
double. For example, to obtain a value for the control at x = (0.5, 0.5), the following 
commands can be used. 
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» assign([x; u] , [0.5 0.5 0]'); 
» controlvalue = double (u_x) ; 

Notice that when we assign the value of x we must assign also the value of u (any 
value can be used). For more information on the commands assign and double, see the 
documentation of GloptiPoly. 

Figure [1] shows some trajectories starting from SS and obtained by simulating the system 
with the calculated control law. 



0.5 - 



C\J n 



-0.5 



-1 - 




-1 -0.5 0.5 1 



"1 

Figure 1: Trajectories obtained by applying the control law u_x. 



7 Second example 

This example is taken from [5^, Example 3.27]. For the nonlinear system 







'-Xi{tf +xi{t)u{t) 






u{t) 



and the cost functional J^{x2(t)^ + u(t)'^)dt, the solution of the HJB equation is X2{ty, 
which is only positive semidefinite. Since the solution is polynomial, we can show with 
this example that the solution can be found exactly (and only using test functions up to 
degree 2): 

mpol x 2 

1.^ 



mpol u 



p=pocp() ; 

p=set(p, 'state' ,x) ; 
p=set (p , ' input ' , u) ; 

p=set (p, 'dynamics ' , [-x(l) "3+x(l) *u; u] ) ; 
p=set (p, ' scost ' ,x(2) '■2+u"2) ; 
p=set(p, 'idirac' ,x, [1; 1]); 
p=set(p, 'fdirac' ,x, [0; 0]); 
p=set(p, 'tconst' , [x>=-l.l; x<=l.l]); 

[status, cost .measures, vf] = ... 
solvepocp (p , ' tf ' , 2) ; 

dispCSubsolution of HJB equation='); 
vf 

Running the above script we obtain the following output: 

Scalar polynomial 
x(2)''2 



8 Variable scaling 

To improve the numerical behavior of the SDP solver used by POCP, the variables (state, 
input and time) could be scaled. Scaling can be achieved by using the GloptiPoly com- 
mand scale. For example, to substitute the variable x with 2*x, use the following com- 
mands 

» mpol X 

» scale (x, 2) 

Scaling changes only the internal representation of the variables. Prom the user point of 
view nothing changes, and there is no need to scale the result of the optimization problem. 
For more information about the scale command, see the documentation of GloptiPoly. 
If possible, all the variables should be scaled within the interval [—1,-1-1]. 



9 Some ideas for new features 



We end this guide by listing ideas for improvements and new features of POCP: 



• To further extend the class of continuous-time problems considered, more flexibility 
on the initial and final time (currently and T) could be permitted. If the cost 

ftf 

J{ti,tf,x{0),u{-)) = h{t,x{t),u{t))dt + H{x{tf)). (6) 
Jti 

is considered, the problem 

min / J{ti,T,Xi,u{xi,t))d^i{U,Xi) 

u{-) 

{t,x{t),u{t)) eCr i^) 
x{T) e Cf 

could be modeled with POCP, with /ij an assigned probability measure on R x R"-. 
Setting /Zj to be a uniform measure on the set {(t, x) : t G [0, 1], Xj G [— 1, — 1] Vj = 
1, . . . , n}, we could obtain a subsolution of the HJB equation which gives a good 
approximation of the value function on a box centered at the origin of the state 
space and for all the values of t in the interval [0, 1], see pQ. 

• Version 3.3 of GloptiPoly cannot be used to retrieve information on the dual vari- 
ables of the moment problem. As a workaround to calculate a subsolution of the 
HJB equation, POCP currently uses GloptiPoly in combination with YALMIP. If 
GloptiPoly will include this feature in some future version, POCP will not require 
YALMIP. 

• GloptiPoly does not support currently the assignment of a measure (e.g., it cannot 
be specified that a variable is uniformly distributed on an interval). If GloptiPoly 
will include this feature in some future version, the POCP code could be significantly 
simplified. 
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